Data Mining for Software Defect Prediction with Open Source and Commercial Softwares
Abstract
Open source data mining software has developed rapidly in recent years. In this work I carry out an empirical study comparing the performance of open source and commercial data mining software. The results show that, both before and after preprocessing of the two NASA datasets used in this work, the commercial data mining software (MATLAB) outperforms the open source data mining software (WEKA) in overall accuracy, probability of false alarm, area under the ROC curve, and probability of detection.
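The abstract names its evaluation measures but does not define them. As a minimal sketch, assuming the definitions conventionally used in defect prediction (accuracy = (TP + TN) / total, probability of detection PD = TP / (TP + FN), probability of false alarm PF = FP / (FP + TN)), they can be computed from confusion-matrix counts as follows. The function name `defect_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def defect_metrics(tp, fp, tn, fn):
    """Compute accuracy, PD and PF from confusion-matrix counts.

    Assumed conventions (not stated in the abstract):
      tp = defective modules predicted defective
      fp = clean modules predicted defective
      tn = clean modules predicted clean
      fn = defective modules predicted clean
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Guard against datasets with no defective (or no clean) modules.
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    return {"accuracy": accuracy, "pd": pd, "pf": pf}


# Hypothetical counts for illustration only.
example = defect_metrics(tp=30, fp=20, tn=140, fn=10)
print(example)
```

Under these definitions a better classifier has higher accuracy and PD and lower PF. The area under the ROC curve is omitted here because it requires predicted scores rather than a single confusion matrix.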
Similar resources
Evaluation of Open Source Integrated Library Management Software: A Comparative Analysis of PhpMyLibrary and Koha
Open source software is software that permits execution, copying, reading, distribution, and improvement without any restrictions. Automated library systems can manage library functions, but commercial library software is very expensive. Open source software can therefore be an appropriate alternative for automated library systems. In addition to providing the general concept of sou...
Antecedents of open source software defects: A data mining approach to model formulation, validation and testing
This paper develops, tests, and validates a model for the antecedents of open source software (OSS) defects, using data and text mining. The public archives of OSS projects are used to access historical data on over 5,000 active and mature OSS projects. Using domain knowledge and exploratory analysis, a wide range of variables is identified from the process, product, resource, and end-user charac...
Software defect prediction using relational association rule mining
This paper focuses on the problem of defect prediction, a problem of major importance during software maintenance and evolution. It is essential for software developers to identify defective software modules in order to continuously improve the quality of a software system. As the conditions for a software module to have defects are hard to identify, machine learning based classification models...
Comparison of Open Source Learning Management Softwares and Presenting a Native Evaluation Tool
Introduction: Nowadays all educational institutes are trying to use technology in their structure. This effort has faced different barriers, including cost, time, and support. Using open source software can therefore partially help in adopting technology. In this article, we review the main features of several open source learning management software packages, while presenting a tool which incl...
Is "Better Data" Better than "Better Data Miners"? (On the Benefits of Tuning SMOTE for Defect Prediction)
We report and fix an important systematic error in prior studies that ranked classifiers for software analytics. Those studies did not (a) assess classifiers on multiple criteria and they did not (b) study how variations in the data affect the results. Hence, this paper applies (a) multi-criteria tests while (b) fixing the weaker regions of the training data (using SMOTUNED, which is a self-tun...